Type 1 error rates of the parsimony permutation tail probability test.

Authors

  • Mark Wilkinson
  • Pedro R Peres-Neto
  • Peter G Foster
  • Clive B Moncrieff
Abstract

Archie (1989) and Faith and Cranston (1991) independently developed a parsimony-based randomization test for assessing the quality of a phylogenetic data matrix. Matrix randomization tests have had a mixed reception from phylogeneticists (e.g., Källersjö et al., 1992; Alroy, 1994; Carpenter et al., 1998; Wilkinson, 1998; Siddall, 2001). In general, however, these are well-founded statistical techniques (Manly, 1991) that may be well-suited to phylogenetic contexts where models or assumptions underlying parametric statistical methods are either difficult to justify or to test. In a matrix randomization test, a test statistic (typically a measure of data “quality”) is calculated for the original data, and the result is contrasted against a null distribution of the test statistic determined by repeated randomization of the data. Randomization is by random permutation of the assignment of character states to taxa within each character. Essentially, each character in the data set is independently shuffled so that congruence between characters is reduced to the extent that would be expected by chance alone. The random permutation preserves some features of the data that are known to affect measures of data quality, such as the total number of characters and taxa and the numbers of taxa with each character state within each character (Archie, 1989; Sanderson and Donoghue, 1989; Faith and Cranston, 1991). Thus the null distribution represents a distribution that one would expect from comparable phylogenetically uninformative data. The simplest parsimony-based matrix randomization tests use the length of the most-parsimonious trees (MPTs) as the test statistic, comparing this for real and randomly permuted data. A corresponding simple test statistic for the null hypothesis that the data are indistinguishable from random is the parsimony permutation tail probability or parsimony PTP (Faith and Cranston, 1991).
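The randomization step and the tail probability can be illustrated with a minimal Python sketch. It rests on several assumptions: the function names `permute_within_characters` and `parsimony_ptp` and the toy matrix are hypothetical, the matrix is stored as one row per taxon, and the MPT lengths are taken as given, since the tree search that would produce them is not shown. The PTP is computed here as the proportion of data sets, the original included, whose MPT length does not exceed that of the original data.

```python
import random


def permute_within_characters(matrix, rng):
    """Shuffle each character (column) independently across taxa (rows).

    The number of taxa, the number of characters, and the count of each
    state within every character are preserved; only the assignment of
    states to taxa changes.
    """
    n_taxa = len(matrix)
    n_chars = len(matrix[0])
    columns = []
    for j in range(n_chars):
        column = [matrix[i][j] for i in range(n_taxa)]
        rng.shuffle(column)  # independent permutation for this character
        columns.append(column)
    # Reassemble the shuffled columns into a taxon-by-character matrix.
    return [[columns[j][i] for j in range(n_chars)] for i in range(n_taxa)]


def parsimony_ptp(original_length, permuted_lengths):
    """Proportion of data sets (original plus permuted) whose MPT length
    is as short as or shorter than the MPT length of the original data."""
    as_short = sum(1 for length in permuted_lengths if length <= original_length)
    # The original data set always counts as one such data set.
    return (as_short + 1) / (len(permuted_lengths) + 1)


# Toy matrix (hypothetical): 4 taxa scored for 3 binary characters.
matrix = [
    ["0", "0", "1"],
    ["0", "1", "1"],
    ["1", "1", "0"],
    ["1", "0", "0"],
]
rng = random.Random(0)
shuffled = permute_within_characters(matrix, rng)
```

In a full test, each permuted matrix would be passed to a parsimony tree search (an external step assumed here) to obtain its MPT length, and the resulting lengths would be supplied to `parsimony_ptp` together with the length for the original data.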
The parsimony PTP is the proportion of data sets (real and randomly permuted) that yield MPTs as short or shorter than the MPTs for the original data. Slowinski and Crother (1998) used 40 real data sets in an empirical evaluation of the utility of the parsimony PTP. Specifically, they compared PTPs with the fraction of clades supported by bootstrap proportions exceeding 50%. In addition, they compared PTPs with the resolution of strict component consensus trees. They reported that data sets that appear to be poorly structured, based on bootstrap analyses or because they have a poorly resolved strict component consensus, tend to have significant PTPs, and they concluded that (p. 300) “the PTP test is too liberal” and is of limited utility. Peres-Neto and Marques (2000) expressed concern at the use of one statistical test (the bootstrap) to evaluate another (parsimony PTP) and presented simulation studies that attempted to address the performance of the PTP test more directly. Their simulation studies involved performing PTP tests on randomly generated data. Because data are generated randomly, the null hypothesis is true and the number of times that the null hypothesis is rejected


Similar resources

Modified signed log-likelihood test for the coefficient of variation of an inverse Gaussian population

In this paper, we consider the problem of two sided hypothesis testing for the parameter of coefficient of variation of an inverse Gaussian population. An approach used here is the modified signed log-likelihood ratio (MSLR) method which is the modification of traditional signed log-likelihood ratio test. Previous works show that this proposed method has third-order accuracy whereas the traditi...


A Permutation Test for Quantile Regression

A drop in dispersion, F-ratio like, permutation test (D) for linear quantile regression estimates (0 ≤ τ ≤ 1) had relative power ≥1 compared to quantile rank score tests (T) for hypotheses on parameters other than the intercept. Power was compared for combinations of sample sizes (n = 20 − 300) and quantiles (τ = 0.50 − 0.99) where both tests maintained valid Type I error rates in simulations...


A Location-Scale Permutation Test

A permutation test for the location-scale problem is proposed within the nonparametric combination framework. The test is based on the combination of the permutation version of the Student t test and the permutation version of Pan L50 test for scale. Type-one error rate and power of this test are compared with those of Lepage and Cucconi tests. It is shown that the proposed test is preferable u...


Asymptotically Valid and Exact Permutation Tests Based on Two-sample U-statistics

The two-sample Wilcoxon test has been widely used in a broad range of scientific research, including economics, due to its good efficiency, robustness against parametric distributional assumptions, and the simplicity with which it can be performed. While the two-sample Wilcoxon test, by virtue of being both a rank and hence a permutation test, controls the exact probability of a Type 1 error un...


Type I and II error

Type I error: A type I error occurs when one rejects the null hypothesis when it is true. The probability of a type I error is the level of significance of the test of hypothesis, and is denoted by *alpha*. Usually a one-tailed test of hypothesis is used when one talks about type I error. Examples: If the cholesterol level of healthy men is normally distributed with a mean of 180 and a standa...



Journal:
  • Systematic biology

Volume 51, Issue 3

Pages: -

Publication date: 2002